We highlight several efficiencies
- R Markdown is our one-stop-shop solution
- R Markdown is inherently reproducible
- R & R Markdown are calibrated for collaboration
- R & R Markdown are free
- BIG Thank you to Project TIER and the Alfred P. Sloan Foundation
EEA, 2 March 2019
St. Lawrence U.
RStudio IDE
Courtesy of Bray, 2016
Good
Bad
Ugly?
# Header 1
## Header 2
### Header 3
This is normal sized text used in the body of our work.
For bullet points, we use dashes, e.g.
- Intro to RStudio
- More content
    - a sub-point
- Back to the original level
R Markdown can produce a variety of document types (other than the default html page):
pdf_document makes a PDF with LaTeX (.pdf)
word_document for Microsoft Word documents (.docx).
odt_document for OpenDocument Text documents (.odt).
rtf_document for Rich Text Format documents (.rtf)
And others.
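As a rough sketch of how one of these formats might be selected from the R console rather than the Knit button, the snippet below writes a tiny throwaway `.Rmd` file and passes a format name to `rmarkdown::render()`. The file contents and the names `rmd_path` and `out_file` are assumptions made up for illustration; rendering is skipped if the rmarkdown package or pandoc is not installed.

```r
# Write a minimal R Markdown file to a temporary path (hypothetical content).
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Format demo\"",
  "---",
  "",
  "Some body text."
), rmd_path)

# Render only if rmarkdown and pandoc are both available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  # Swap "word_document" for "odt_document" or "rtf_document" to change the output type.
  out_file <- rmarkdown::render(rmd_path, output_format = "word_document", quiet = TRUE)
  print(out_file)
}
```

The same choice can usually be made by setting the `output:` field in the document's YAML header instead, which is what the Knit button reads.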
R Markdown can also be re-purposed to produce a presentation file (as with this presentation):
ioslides_presentation opens in your browser and is interactive (.html)
slidy_presentation is another browser-based presentation format (.html)
beamer_presentation makes a PDF with LaTeX (.pdf)
Think about data analysis as falling into three loose categories:
All of this occurs in the code "chunk"
To open a code chunk hit CMD + OPTION + I on a Mac
Or type out three backticks ``` followed by {r}
And then three more backticks ``` on another line.
Within the {r} you can specify options, like {r, eval = FALSE} if you don't want it to evaluate the code
Or you can label the code chunk, e.g. {r cars} labels the chunk "cars" in your ToC
```{r cars, echo = TRUE}
summary(cars)
```
The option echo = TRUE means that the code gets included in the rendered html.
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
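For contrast with `echo = TRUE`, here is a minimal sketch of a chunk that hides its code. The label `mean-speed` and the variable `avg_speed` are hypothetical names chosen for this example; `cars` is the same built-in dataset used above.

```{r mean-speed, echo = FALSE}
# echo = FALSE keeps this code out of the rendered file;
# the value printed on the last line still appears in the output.
avg_speed <- mean(cars$speed)
avg_speed
```

Adding `eval = FALSE` instead would do the opposite: the code would be shown but not run.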
R Markdown and RStudio together can display data, statistical tests, and regression output directly in the document.
##   session subject  r1  r2  r3  r4  r5  r6  r7  r8  r9  treatment team
## 1       1       1   0   0   0   0  10  10   0   0   0 individual   NA
## 2       1       2   0   0  30  40  40   0   0   0  20 individual   NA
## 3       1       3  30  30   0   0   0  60  60  10   0 individual   NA
## 4       1       4  20   0 100   0   0  30  75 100 100 individual   NA
## 5       1       5 100 100 100 100 100 100 100 100 100 individual   NA
## 6       1       6 100 100 100 100 100 100 100   0   0 individual   NA
##           uniqid
## 1 1_individual_1
## 2 1_individual_2
## 3 1_individual_3
## 4 1_individual_4
## 5 1_individual_5
## 6 1_individual_6
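The data above are in wide form (one column per round, `r1` to `r9`), while the models in this deck use a single `value` column. The slides do not show how that reshaping was done, so the following is only a sketch of one possible approach using base R's `reshape()`. The data frame `wide`, its values, and the column name `round` are made up for illustration and are not the experiment's data.

```r
# Toy wide-format data (hypothetical values, not the experiment's data).
wide <- data.frame(
  subject   = 1:4,
  treatment = c("individual", "individual", "message", "message"),
  r1 = c(0, 30, 50, 100),
  r2 = c(10, 40, 60, 100),
  r3 = c(0, 20, 80, 90)
)

# Stack the round columns r1..r3 into a single `value` column,
# recording which round each row came from in `round`.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("r1", "r2", "r3"),
  v.names   = "value",
  timevar   = "round",
  times     = 1:3,
  idvar     = "subject"
)

# Same formula as the regression in this deck, fitted here to the toy data only.
fit <- lm(value ~ treatment, data = long)
print(coef(fit))
```

Each subject now contributes one row per round, which is the layout that formulas such as `value ~ treatment` expect.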
##
##  Wilcoxon rank sum test with continuity correction
##
## data:  value by treatment
## W = 52876, p-value = 3.838e-10
## alternative hypothesis: true location shift is not equal to 0
##
## Call:
## lm(formula = value ~ treatment, data = SutNarrow)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -61.370 -29.385  -0.542  38.630  60.615
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)          39.385      1.451  27.152  < 2e-16 ***
## treatmentmessage     21.985      1.994  11.028  < 2e-16 ***
## treatmentmixed       10.609      1.925   5.510 3.92e-08 ***
## treatmentpaycomm     10.886      2.144   5.077 4.09e-07 ***
## treatmentteamtreat   16.313      2.629   6.204 6.34e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34.81 on 2713 degrees of freedom
## Multiple R-squared:  0.04473, Adjusted R-squared:  0.04333
## F-statistic: 31.76 on 4 and 2713 DF,  p-value: < 2.2e-16
## Oneway (time) effect Random Effect Model
## (Swamy-Arora's transformation)
##
## Call:
## plm(formula = value ~ treatment, data = SutNarrow, effect = "time",
## model = "random", index = c("uniqid"))
##
## Balanced Panel: n = 302, T = 9, N = 2718
##
## Effects:
## var std.dev share
## idiosyncratic 1197.631 34.607 0.987
## time 16.062 4.008 0.013
## theta: 0.555
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -61.9325 -28.6639 -2.5889 33.7276 64.3014
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## (Intercept) 39.3854 1.9657 20.0367 < 2.2e-16 ***
## treatmentmessage 21.9850 1.9818 11.0936 < 2.2e-16 ***
## treatmentmixed 10.6093 1.9140 5.5430 2.973e-08 ***
## treatmentpaycomm 10.8862 2.1315 5.1072 3.270e-07 ***
## treatmentteamtreat 16.3130 2.6138 6.2412 4.342e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 3403100
## Residual Sum of Squares: 3249200
## R-Squared: 0.045244
## Adj. R-Squared: 0.043836
## Chisq: 128.563 on 4 DF, p-value: < 2.22e-16
| | Dependent variable: value |
| --- | --- |
| treatmentmessage | 21.985*** |
| | (1.994) |
| treatmentmixed | 10.609*** |
| | (1.925) |
| treatmentpaycomm | 10.886*** |
| | (2.144) |
| treatmentteamtreat | 16.313*** |
| | (2.629) |
| Constant | 39.385*** |
| | (1.451) |
| Observations | 2,718 |
| R² | 0.045 |
Econometrics
Senior Seminar in Urban Economics
Micro
Behavioral Economics
Special Studies
Business Analytics
Honors theses
Michael:
Students will only learn commands through graded assignments
Aaron:
Students can struggle with basic computing (working directory, etc.)
Students have to adjust to getting the Basics Right
Students know WYSIWYG
Installing packages
Chrome extensions
R Studio Server = GOOD & free
How about Bayes' Rule?
\[Pr(\mbox{Outcome} | \mbox{signal}) = \frac{\theta p}{\theta p + (1 - \theta)(1 - p)}\]
R Markdown uses \(\LaTeX\) for math, and the result is displayed immediately in RStudio.
That is, \(\LaTeX\) without the challenges of learning the packages, tables, etc. that make learning \(\LaTeX\) so hard.
In-line equations are bracketed by single dollar signs $.
Off-set equations are bracketed by double dollar signs $$.
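As an illustration of both forms, the source below typesets two in-line expressions and one off-set equation. The numbers are assumptions chosen only for the arithmetic: the slides do not define the symbols, so this reads \(\theta\) as a prior probability and \(p\) as the signal's accuracy, and applies the standard form of Bayes' Rule.

```latex
With $\theta = 0.5$ and $p = 0.8$, the posterior is

$$
\Pr(\mbox{Outcome} \mid \mbox{signal})
  = \frac{0.5 \times 0.8}{0.5 \times 0.8 + (1 - 0.5)(1 - 0.8)}
  = \frac{0.4}{0.5}
  = 0.8
$$
```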
Credit to people from whom we pilfer some content (and recognize their contribution)
New to R?